I, Me, Mine: The Role of Personal Phrases in Author Profiling
Authors
Abstract
The Author Profiling (AP) task aims to distinguish between groups of authors who share a demographic characteristic, such as gender or age, by studying their language use. In this work we studied the role of personal phrases (i.e., sentences containing first-person pronouns) in the AP task. We support the idea that people better expose their personal interests and writing style when they talk about themselves and, consequently, that words near a personal pronoun reveal valuable information for the classification of authors. An evaluation on different social media data showed that phrases containing singular first-person pronouns are highly valuable for predicting the age and gender of users. Considering only these phrases, we reduced the information in the user documents by up to 60% while obtaining classification performance comparable to that obtained using all available data. In addition, the results obtained with personal phrases considerably outperformed those from non-personal sentences, indicating their greater suitability for the AP task. We believe these findings could be further applied to the design of strategies for constructing AP corpora, novel feature selection methods, and new feature and instance weighting schemes.
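As a rough illustration of the filtering idea in the abstract, the sketch below splits a user document into sentences and keeps those that contain a singular first-person pronoun. It is not the authors' implementation: the pronoun list, the naive regex sentence splitter, and the function names (`split_sentences`, `is_personal`, `partition_phrases`) are all assumptions made for this example.

```python
import re

# Assumed pronoun list; the abstract only says "singular first person pronouns".
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_personal(sentence):
    """Return True if the sentence contains a singular first-person pronoun."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    # Drop surrounding quotes and contraction tails, so "i'm" is read as "i".
    stems = (tok.strip("'").split("'")[0] for tok in tokens)
    return any(stem in FIRST_PERSON_SINGULAR for stem in stems)


def partition_phrases(text):
    """Split a document into (personal, non_personal) sentence lists."""
    personal, non_personal = [], []
    for sentence in split_sentences(text):
        (personal if is_personal(sentence) else non_personal).append(sentence)
    return personal, non_personal
```

For instance, `partition_phrases("I love hiking. The weather was bad.")` places the first sentence in the personal list and the second in the non-personal list. Either list could then be passed to any text classifier in place of the full document.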
Similar resources
Self and Others
more intelligible parts deal, vividly at times, with exchanges between people. A number of passages show valuable insight and at times the author writes clearly, simply and with great power. At other times the content evades understanding; or else it consists of poetic sequences of words. Consider the following: 'My selfbeing, my consciousness and feeling of myself, that taste of myself, of I a...
A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique used to predict the profiles of the authors of unknown texts by analyzing their writing styles. Author profiles are characteristics of the authors such as gender, age, native language, country and educational background. Existing approaches to author profiling suffer from problems such as high feature dimensionality and fail to capture th...
P-66: The Impact of Infertility on Psychological and Social Status of Women in Iran: A Content Analysis Study
Background: To explore the impact of infertility on the psychological and social status of women in Iran. Materials and Methods: A qualitative design, based on the content analysis approach, was employed for data collection and analysis of the experiences of Iranian women on infertility. Qualitative studies are intended to enhance understanding and describe the world of human experienc...
The peculiarities of art space worldview in I. S. Shmelev’s novel “The Lord’s Summer”
The article looks into the global literary perspective presented in I. S. Shmelev’s novel “The Lord’s Summer” through the author’s special organization of the time scene, the characters’ language or speech system and genre specific features as well. In particular, it is noted that, through spatial reference points the author's system of values is expres...
Neighborhood-user profiling based on perception relationship in the micro-blog scenario
In the micro-blog scenario, personal user profiling relying on content is limited for recommending desired diverse subjects due to its shortcomings of short text, often leading to a poor recall. Currently, many methods only utilized the personal knowledge from each individual user to represent user profile without considering the neighborhood information. However, resource information related t...